Mutated Nucleic Acid of a Cel I-Endonuclease and Method for producing the
Recombinant, Full-length Cel I-Protein 

The invention relates to a method for producing a recombinant, complete CEL I-Protein, a 
plant endonuclease, or parts thereof, by the expression of synthetic DNA sequences. The 
invention also relates to the DNA sequences themselves, which are produced for this purpose.
Furthermore, the invention relates to the use of the recombinantly produced CEL I-enzyme
for detecting point mutations as well as larger mutated regions such as deletions/insertions.

The CEL I-enzyme is an endonuclease found in celery (Oleykowski et al. 1998), which 
recognises single "uneven elements" within the DNA-double helix and cleaves there in a 
specific manner. The enzyme therefore constitutes a very useful means for detecting 
mutations. It also specifically recognises single base mismatches (point mutations) and
cleaves one of the two DNA strands immediately 3' of the mutation, thereby incising
into the double strand.

For the application desired in this context, CEL I provides a number of advantages in 
comparison to other known nucleases also being able to incise into DNA-strands at uneven 
elements of the double helix structure: 

The major problem of the presently known mismatch-recognising endonucleases, some of
which are also commercially available, is their incomplete capability to identify all
possible types of mismatches or mutations, together with the unspecific DNA degradation
they produce. As an example, S1 nuclease does not cut at single base mismatches (Loeb and
Silber, 1981). Mung bean nuclease is five times more efficient at pH 5 than
in the neutral pH range (Kowalski and Sandford, 1982). T4 endonuclease VII
does not only cut one strand of a double helix, but always cuts the
complementary strand as well (Solaro et al., 1993). Furthermore, the endonuclease isolated
from T4 phages was shown to have an unspecific activity of random DNA degradation
significantly higher than that of the CEL I-enzyme synthesised by the present
method (Cotton et al., 1999). Moreover, the specificity of T4 endonuclease VII
depends strongly on the length of the substrate and on the sequence
context of the mismatches to be detected (Babon et al., 1999; Norberg et al., 2001).

This has to be regarded as a considerable disadvantage especially for a specific selection of 
mutations being unknown at first - a feature, that is of special relevance for the application 
described herein. 

CEL I belongs to a distinct group of nucleases, which are found in many plant species and 
which are especially characterised by their specific maximum of activity at neutral pH value 
(Oleykowski et al., 1998), although CEL I activity is also found over a range of pH
values between pH 5 and pH 9.5 (Oleykowski et al., 1998).

The capability to recognise and cut each form of base mismatches also irrespective of the AT- 
content in the vicinity of the mutation distinguishes the isolated CEL I-enzyme from other 
nucleases belonging to the family of plant nucleases and having their activity peak at a neutral 
pH value, like e.g. SP nuclease isolated from spinach (Yang et al., 2000; Oleykowski et al., 
1999). 

Because of the toxic character of CEL I, it has so far been impossible to express CEL I
successfully in recombinant form in cells without compartmentation; the enzyme could thus
only be obtained by laboriously purifying it in its native form from the plant.

It is thus one objective of the present invention to produce enzymatic material in the form of
(the purest possible) active CEL I-enzyme in arbitrary amounts and by means of simple
production methods, thereby providing sufficient amounts of this enzymatic material.

This objective of the invention is achieved by a special modification of the common DNA 
sequence of Cel I, this modification being especially designed for this aim. A further aspect of 
the present invention is thus this newly designed sequence. 

A further aspect of the present invention refers to a method for producing the recombinant,
complete CEL I-protein comprising the following steps: At first, a scheme of the DNA
sequence to be synthesised is created. This scheme is based on the cDNA sequence of the
CEL I-protein isolated from the celery plant Apium graveolens L. (see Oleykowski et al.,
2000).



For expression in yeast, this DNA sequence is redrafted according to the common
yeast codon frequency (codon usage) while retaining the amino acid sequence of the
CEL I-protein (FIG. 1). By the teaching of the present invention, expression of Cel I in the
yeast Pichia pastoris was thus enabled or favoured for the first time. A method for optimising
codon usage is described, e.g., by Outchkourov NS et al. for the protein equistatin of
the sea anemone. This document neither reveals nor suggests the use of this method for
non-expressible enzymes with toxic effects on the cell.
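
The requirement that a codon-adapted sequence retains the amino acid sequence can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and an in-frame reading of the sequences; the "native" fragment is a hypothetical sequence invented purely for illustration, while the recoded fragment is the sequence of primer Testf (SEQ ID No. 1). The function names are likewise illustrative and not part of the described method.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_protein(native: str, recoded: str) -> bool:
    """True if both DNA sequences translate to the same amino acid sequence."""
    return translate(native) == translate(recoded)


# Hypothetical "native" fragment (invented for illustration only).
native_fragment = "ATGACGCGATTATATTCTGTGTTC"
# Sequence of primer Testf (SEQ ID No. 1), read in frame from its ATG (assumption).
recoded_fragment = "ATGACCAGACTGTACTCCGTGTTC"

print(translate(recoded_fragment))
print(encodes_same_protein(native_fragment, recoded_fragment))
```

A check of this kind only confirms that the protein is unchanged; it says nothing about whether the chosen codons actually match the codon frequency of the host.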

Concretely, the invention thus relates to a method for producing a nucleic acid sequence
which codes for the complete CEL I-protein and can be recombinantly expressed in host
cells, wherein the method comprises the following steps: providing the sequence coding for
the CEL I-protein from a suitable organism, in particular from Apium graveolens L., and
adequately modifying the codon frequency of the sequence to be expressed in comparison to
the native sequence, wherein this modification is performed with regard to the host organism
to be used for expression. This modification is accomplished by the following steps: a)
partition of the planned sequence into an even number x, in particular 8, of overlapping regions,
b) synthesis of 2x mutated oligonucleotides, in particular oligonucleotides 1-16 to 16-16, each
of which comprises the entire length of one overlapping region of both strands of the coding
sequence, c) first PCR amplification in order to produce x/2, in particular 4, overlapping
fragments using the oligonucleotides of step b), d) second PCR amplification
in order to produce x/4, in particular 2, overlapping regions using the
fragments of step c), and e) third PCR amplification in order to produce x/8, in particular 1,
fragment, which comprises the coding region of CEL I. A schematic overview of a preferred
embodiment is depicted in FIG. 3.
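
The halving scheme of steps a) to e) can be summarised as simple bookkeeping. The sketch below is illustrative only; the function name and the returned keys are assumptions, and the restriction to values of x divisible by 8 is an assumption made so that x/8 is a whole number.

```python
def assembly_plan(x: int) -> dict:
    """Count the oligonucleotides and fragments for steps a) to e).

    x is the number of overlapping regions of step a). The sketch assumes
    that x is a positive multiple of 8, so that x/2, x/4 and x/8 are integers.
    """
    if x <= 0 or x % 8 != 0:
        raise ValueError("this sketch assumes x is a positive multiple of 8")
    return {
        "regions": x,                  # step a)
        "oligonucleotides": 2 * x,     # step b)
        "first_pcr_fragments": x // 2,   # step c)
        "second_pcr_fragments": x // 4,  # step d)
        "third_pcr_fragments": x // 8,   # step e)
    }


# The preferred case named in the text: x = 8.
print(assembly_plan(8))
```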

As an alternative, a method according to the invention may also be characterised in that
instead of step e) the following steps are performed: e') cloning the fragments generated in
step d) into a suitable vector, and f) appropriately digesting the vectors and ligating the
fragments in order to fuse them to form a complete "fragment" comprising the coding region
of CEL I. It is thereby possible to perform further modifications of the sequence harboured
in vector systems, if desired. Respective methods for this aim are familiar to the expert in the
field of nucleic acid applications and are described - among others - in the standard works
"Molecular Cloning - A Laboratory Manual", Sambrook & Russell, 3rd Edition (2001), Cold
Spring Harbor Laboratory Press, or "Current Protocols in Molecular Biology", Ausubel et al.
(1994 pp), Harvard Medical School.

Preferred is a method according to the invention, in which the codon frequency of the 
sequence to be expressed is modified according to the codon frequency of yeast, since yeast 
offers special advantages as a host organism for expression (see below). 

Particularly preferred is a method according to the invention, which is characterised by the
further attachment of nucleotides coding for additional N-terminal or C-terminal
amino acid tags, in particular tags consisting of 6 histidines. If necessary or useful,
a base sequence coding for 6 histidines (a "His-tag") can be added to the original sequence
either C-terminally or N-terminally. Due to their strong
affinity to nickel ions, these tags are intended to support the purification of the expressed
protein by means of immobilised nickel ions, if desired. Paula de Mattos Areas et al.,
e.g., describe - among other things - the use of such "His-tags" in Escherichia coli. The
fragment provided with a His-tag sequence at its N-terminus has a cleavage site for
factor Xa 3' of the tag sequence. This allows the expressed enzyme to be processed
subsequently by cutting off the His-tag sequence, which is potentially obstructive for enzyme
activity. The sequence designed in this way will be designated in the following as the
6His-Xa-Cel I sequence. The fragment C-terminally linked to the His-tag will be designated as
Cel I-6His in the following description of the present invention.

In order to facilitate the several steps of the method, the oligonucleotides synthesised in step
b) have an average length of 70 nucleotides and overlap in each case by about 20 bases. These
values, however, are to be understood as mere guide values and may vary depending on the
intended use. Suitable variants are easy to find for the expert on the basis of reaction or kinetic
parameters; they are particularly dependent on the temperature and the base sequence.
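
How a planned sequence might be divided into oligonucleotides of about 70 nucleotides with overlaps of about 20 bases, alternating between the two strands, can be sketched as follows. This is an illustration under simplifying assumptions (fixed oligonucleotide length, fixed overlap, even-numbered windows on the given strand, odd-numbered windows as reverse complement); the function names are hypothetical, and in practice the lengths would be adjusted to temperature and base sequence as stated above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(seq: str, oligo_len: int = 70, overlap: int = 20) -> list:
    """Split seq into overlapping windows that alternate between strands.

    Windows advance by (oligo_len - overlap) bases. Even-numbered windows are
    returned as written, odd-numbered windows as their reverse complement.
    The last window may be shorter than oligo_len.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    index = 0
    while True:
        window = seq[start:start + oligo_len]
        oligos.append(window if index % 2 == 0 else reverse_complement(window))
        if start + oligo_len >= len(seq):
            break
        start += step
        index += 1
    return oligos


# Hypothetical 170-base test sequence (not the CEL I sequence).
demo_sequence = ("ACGTTGCA" * 22)[:170]
demo_oligos = tile_oligos(demo_sequence)
print(len(demo_oligos), [len(o) for o in demo_oligos])
```

Because neighbouring oligonucleotides lie on opposite strands, the 3' end of one can anneal to the 3' end of the next within the overlap, which is what allows them to be extended against each other in PCR.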

A further aspect of the invention relates to a method for producing a recombinant, complete
CEL I-protein from Apium graveolens L., comprising a) performing the above described
method and b) expressing the nucleic acid sequence by means of a suitable expression system.
Especially preferred for this aim are vector-based expression systems selected from the
pPIC 9, pPIC 3.5 and pQE vectors. However, other expression systems that are familiar to the
expert may also be employed, depending on the host strain used.

A preferred option thereby is the expression of the nucleic acid sequences in a host cell which is
selected from Hansenula polymorpha, Pichia pastoris, Saccharomyces cerevisiae, HeLa cells,
CHO cells, Toxoplasma gondii and Leishmania. Particularly preferred is a method in
which the employed Pichia pastoris strain is the strain GS115.

The invention also relates to the complete DNA sequence of the CEL I-protein or expressible
parts thereof derived from Apium graveolens, wherein said sequence is adapted for expression
and is provided by a method according to the present invention. The wording "expressible
parts thereof" refers to parts of the nucleic acid sequence coding for a polypeptide chain
having an enzymatic function, in particular the function of the native CEL I-enzyme. Also
comprised by the scope of the invention, however, are nucleic acids coding for epitopes.

A preferred form is represented by a DNA sequence according to the invention which codes
for the Apium graveolens CEL I-protein, this sequence being characterised in that
nucleotides are furthermore added which encode additional N-terminal or C-terminal amino
acid tags, in particular tags consisting of 6 histidines. In a likewise preferred form, the
sequence is equipped at both ends with restriction endonuclease cleavage sites which are absent
in the remaining sequence and in the vector to be employed. A most preferred Apium graveolens
CEL I-protein encoding sequence according to the invention, or parts thereof, is presented in
SEQ ID No. 7. "Parts" in this context especially means the fragments serving as probes, but
also the overlapping oligonucleotides used for generating and cloning (see FIG. 2).
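
The condition that a flanking restriction site must be absent from the remaining sequence can be tested with a simple string search. The sketch below is illustrative: the recognition sequences of EcoRI, XhoI and HindIII are the well-known ones, whereas the demo construct is a hypothetical sequence and the function names are assumptions. Since these recognition sequences are palindromic, searching one strand is sufficient.

```python
# Recognition sequences of the restriction enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
}


def find_sites(seq: str, site: str) -> list:
    """Return all start positions (0-based) of site in seq, overlaps included."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def flanked_only(construct: str, site: str) -> bool:
    """True if site occurs exactly at both ends of construct and nowhere else."""
    return find_sites(construct, site) == [0, len(construct) - len(site)]


# Hypothetical construct: EcoRI site, Kozak-like start, short filler, EcoRI site.
ecori = RECOGNITION_SITES["EcoRI"]
demo_construct = ecori + "ACCATGGCTAGCACTG" + ecori

print(flanked_only(demo_construct, ecori))
```

The same search would also have to be run against the sequence of the vector to be employed, which is not shown here.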

Both CEL I coding sequence variants were equipped at their ends with sequences of
restriction sites, which are helpful for subsequent cloning steps. Examples of these sequences
are the EcoRI restriction site positioned at both ends or the XhoI restriction site at the
C-terminus (FIG. 2). Moreover, the translational start ATG was integrated in a Kozak sequence
(consensus sequence for translational initiation) (ACC ATG G) (Kozak 1987; Kozak 1990) in
both sequences. In the further procedure, 16 deoxyoligonucleotides were synthesised, which
correspond to the planned sequence and completely cover the whole length of the respective
cDNA. The deoxyoligonucleotides were synthesised such that their sequences
alternately corresponded to the 5'-3' strand or to the complementary 3'-5' DNA strand. The
length of the deoxyoligonucleotides was between 40 and 93 bases, with overlaps of 20 bases on
average between neighbouring sequences (FIG. 2). The artificial CEL I-gene was
synthesised in the form of two independent partial fragments, the N-terminal and the
C-terminal fragment, which were fused afterwards via a HindIII restriction site (FIG. 2).

The generation of a partial fragment was accomplished according to the following principle: 
In a first step, four DNA sequences having the double length of two neighbouring 
deoxyoligonucleotides (minus the overlapping sequences) were generated for each fragment 
by means of asymmetric PCR. The amplification was accomplished such, that the 
neighbouring, accumulated DNA-strands in each case represent the opposite strand. 

In a second amplification step, secondary products having a length of four original 
oligonucleotides (minus the overlapping sequences) can be synthesised (Fragments E and F, 
FIG. 3). 

A further conversion of the products obtained in this way into preparations mainly consisting of
single-stranded DNA during an asymmetric PCR reaction with terminal oligonucleotides
allowed a double-stranded DNA fragment of about 400 bp (fragment G; FIG. 3) to be produced
by means of two further PCR amplifications.

The two fragments synthesised in this way were subsequently cloned in E. coli and their
sequences determined. Erroneous sequence sections were excised with appropriate restriction
enzymes and replaced by the corresponding correct sequence part of another clone. After the
N-terminal and the C-terminal fragment of the CEL I coding DNA region had been generated in
correct form, both partial fragments were fused via the above cited HindIII restriction site in a
suitable vector and cloned in E. coli.

Preferably, the artificial CEL I-gene generated in such a way can be transferred into suitable
expression vectors during subsequent steps of the procedure. Favourable examples of these
vectors are - among others - expression vectors suitable for the Pichia pastoris expression
system, like the pPIC 9 or the pPIC 3.5 vector (Invitrogen). Also possible, however, are other
expression vectors and host organisms other than yeast, which are familiar to the expert. The
subject of the invention is thus not restricted to a special host system.

A further aspect of the present invention thus relates to a host organism which is capable of
integrating and expressing a DNA sequence according to the invention. This host is preferably
selected from Hansenula polymorpha, Pichia pastoris, Saccharomyces cerevisiae, HeLa cells,
CHO cells, Toxoplasma gondii and Leishmania. In particular, the Pichia pastoris strain
GS115 is used. However, plant cells or insect cells may also be employed.

A preferred host organism in this invention, which is employed for expression, is the yeast
Pichia pastoris (Invitrogen), the preferred yeast strain being GS115 (Invitrogen). Yeast
in general is preferred as the expression system, since it has - as a eukaryotic organism - many
advantages compared to bacterial expression systems, e.g. the post-translational
processing of proteins. Another important advantage of using a eukaryotic expression system
is based on the cellular compartmentation present in eukaryotic organisms. Expressing
nucleases by means of recombinant expression systems in prokaryotic host organisms like
bacteria is toxic for the cells due to the nucleases' DNA-degrading properties and has
consequently been described several times as being extremely difficult (Golz et al., 1995;
Kosak and Kemper, 1990).

Pichia pastoris moreover is able to metabolise methanol as a carbon source. The
first step of methanol catabolism is catalysed by alcohol oxidase. Pichia harbours two
genes which code for this enzyme, the AOX1 gene and the AOX2 gene, wherein the AOX1 gene
provides by far the greater portion of active alcohol oxidase in the cells. The expression of the
AOX1 gene is regulated and induced by methanol. For the preferred expression system of this
invention, the AOX1 gene was isolated and the AOX1 promoter was used for the expression
of an arbitrary gene (Ellis et al., 1985; Koutz et al., 1989; Tschopp et al., 1987a).

The form of heterologous expression of the CEL I-enzyme in Pichia pastoris preferred
in this invention is the secretory form of protein expression. Secretory protein expression in
Pichia pastoris has the advantage that - because of the very low level of native protein
secretion of this yeast - the major component of the total protein in the medium is constituted
by the desired protein. This facilitates further steps of purification of the heterologous protein
or even makes them potentially unnecessary.

The secretory mechanism preferably used in this expression method is based on the secretion
signal α-factor of Saccharomyces cerevisiae (Barr et al., 1992), which is already integrated in
the prefabricated expression vector pPIC 9 (Invitrogen).

A further reason for preferring the Pichia system, e.g. over prokaryotic expression
systems, is the capability of Pichia pastoris to perform post-translational modifications such as
the N-glycosidic attachment of sugars, but without causing hyperglycosylation as is the
case, e.g., with S. cerevisiae (Grinna and Tschopp, 1989; Tschopp et al., 1987b).
Post-translational modifications can be crucial for the proper function of an enzyme.

The construct which is preferred in this invention for the expression of the active
CEL I-enzyme consists of the Cel I-6His sequence, which is ligated in the appropriate
orientation into the EcoRI restriction site of the expression vector pPIC 9, and which in this
form has its open reading frame in fusion with the signal peptide. By means of cloning into
the EcoRI restriction site of the vector, the CEL I-gene, preferably the Cel I-6His construct, is
put under the control of the AOX1 promoter positioned 5' of the construct. For cloning
purposes in E. coli, the vector provides both an ampicillin resistance and an E. coli origin of
replication (FIG. 4).

It is necessary to integrate the gene into the yeast genome in order to achieve expression of
the CEL I-gene. In the case preferred herein, the integration is accomplished by
homologous recombination, i.e. by a crossing over between the His4 locus on the
chromosome and the His4 locus on the vector. The His4 gene of Pichia pastoris is used for
the selection of stable transformants. For this purpose, the His4 gene, which is part of the
histidine metabolism pathway, is present in the yeast genome in a mutated form, whereas it is
present in the vector in the wild-type form.

For this reason, yeast cells without the integrated vector are not capable of growing on
histidine-free medium, whereas yeast cells successfully transformed with pPIC9 are capable of
forming colonies on a medium containing no histidine. Since the vector lacks a yeast origin of
replication, colonies can only arise from a founder cell in which a recombination has
taken place between the plasmid and the yeast genome, whereby the vector including the
target gene has been integrated into the yeast genome.

The preferred technique of transformation in this method is yeast transformation by means
of electroporation. For this technique, 20-30 µg of linearised vector DNA, purified
by phenol extraction after linearisation, are added to 80 µl of freshly prepared competent cells
of the yeast strain GS115 in a sterile cuvette. The CEL I-6His-pPIC9 construct in this method
is linearised via the unique SalI restriction site (FIG. 4).

The electroporation was performed by means of a Gene Pulser II system (Bio-Rad) employing
50 µF / 200 Ω / 1.8 V and a pulse time of about 10 msec.
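
Reading the electroporation settings as a capacitance of 50 µF and a resistance of 200 Ω, the stated pulse time of about 10 msec agrees with the time constant of a simple RC discharge. The sketch below assumes an idealised exponential-decay pulse in which the set resistance dominates; the real pulse time also depends on the resistance of the sample, and the function name is illustrative.

```python
def time_constant_ms(capacitance_uF: float, resistance_ohm: float) -> float:
    """Time constant tau = R * C of an idealised RC discharge, in milliseconds."""
    capacitance_farad = capacitance_uF * 1e-6
    tau_seconds = resistance_ohm * capacitance_farad
    return tau_seconds * 1e3


# Settings as read from the text: 50 uF and 200 ohm.
tau = time_constant_ms(50, 200)
print(f"tau = {tau:.1f} ms")
```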

Successfully transformed cells can be identified after an incubation period of 5 days on a 
histidine-free medium as properly grown colonies. Further evaluation in respect of a stable 
transformation was accomplished by a PCR-based detection of the target gene within the 
yeast genome. 

As a forward primer for the PCR, we used the CEL I-specific primer Testf:
5'-ATGACCAGACTGTACTCCGTGTTC-3' (SEQ ID No. 1; FIG. 2), and as a reverse primer
we used a primer complementary to the vector cassette, the primer AOX3':
5'-GCAAATGGCATTCTGACATCC-3' (SEQ ID No. 2; FIG. 4). The size of the expected PCR
product was about 1000 bp. As a positive control, the vector construct Cel I-6His-pPIC9 was
used as a template. As negative controls, we used both genomic yeast DNA of a clone
transformed with the parental vector pPIC9 without a CEL I-insert, and a sample without
template in order to exclude "false positives" arising as a consequence of contamination.

In an example given herein for the preferred method, 16 clones of 20 yeast clones tested were 
identified as unambiguously positive (FIG. 5). For additional certainty in respect of the 
integration of the target gene into the yeast genome, PCR-positive clones were analysed by 
hybridising a respective Southern Blot with a CEL I-specific probe. Here too, the controls
used were plasmid DNA of the CEL I-6His-pPIC9 construct (positive control) and
genomic DNA of yeast transformed with the parental vector pPIC9 without the CEL I-insert
(negative control).

In one example given herein for the preferred method, a digoxigenin-labelled probe with a
length of 262 bases was synthesised. The probe synthesis was accomplished by means of two
oligonucleotides specifically annealing in the N-terminal region of the coding CEL I
sequence, the oligonucleotides "Sonde f" ("probe f")
5'-ATGACCAGACTGTACTCCGTGTTC-3' (SEQ ID No. 3) and "Sonde r" ("probe r")
5'-GTCAGGGGTATCAATGAAATGTAA-3' (SEQ ID No. 4; FIG. 2).

In one example given herein for the preferred method 10 clones of 12 clones being tested as 
PCR-positive were again confirmed to be positive by means of Southern hybridisation (FIG. 
6). Clones being tested as undoubtedly positive in both test methods were used for expression. 

The expression of the gene CEL I under the control of the AOX1 promoter (Ellis et al., 1985;
Koutz et al., 1989; Tschopp et al., 1987a) was realised by the repressing/derepressing
mechanism and subsequent induction, as described for Saccharomyces cerevisiae (Johnston,
1987). The exact implementation in Pichia pastoris followed the protocol of the
Pichia expression system (Invitrogen).

Cultivation and growth of the clones was realised for two days in a medium containing
glucose. Glucose acts as a repressor of the genes controlled by the AOX1 promoter,
thus blocking their transcription. After a cellular density of about OD600 = 3-5 had been
reached, expression was induced by changing the medium and adding the inducer methanol at a
cellular density of OD600 = 1. Expression was allowed to proceed for a period of 7-8 days, with
the metabolised carbon source methanol being added every 24 hours.

A further aspect of the invention refers to a recombinant, complete CEL I-protein produced by 
a method according to the invention. 

Since the expressed protein is secreted in the preferred example for expressing CEL I, the
desired enzyme could easily be purified from the supernatant of the expression culture
by means of techniques known in the prior art. After the proteins in the
supernatant had been concentrated by a factor of about 200 by means of ultrafiltration tubes
(Vivascience), the active CEL I-enzyme could be used directly as a protein which recognises
and cleaves mismatch sequences.

Since Pichia pastoris, as already mentioned, secretes very low amounts of protein, the
expressed and secreted CEL I-enzyme was ready for use without further processing
or purification steps.

A specific activity assay assessing functionality is required to test the enzyme's capability to
recognise all of the eight base mismatch combinations.

By way of example, constructs were created for this purpose which allowed
all of the eight mismatch combinations to be synthesised by means of the respective
combinations of heterohybrids. For the application preferred in this method, these constructs
were generated by cloning four oligonucleotides into the EcoRI/HindIII-cleaved
pUC19 vector. These oligonucleotides differed at only one single base position (FIG. 7). The
four cloned fragments could be used as defined templates for the amplification of
fragments using fluorescence-labelled oligonucleotides, the fragments directly taking part in
heterohybrid formation. Depending on the combination of amplification products in
heterohybrid synthesis, all of the eight possible mismatch combinations could be generated.
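
Why there are exactly eight mismatch combinations follows from simple counting: four bases give ten unordered base pairings, of which two (A/T and C/G) are the regular Watson-Crick pairs. A minimal sketch of this enumeration, with an illustrative function name:

```python
from itertools import combinations_with_replacement

# The two regular Watson-Crick pairings, stored as unordered sets.
WATSON_CRICK = {frozenset("AT"), frozenset("CG")}


def mismatch_combinations(bases: str = "ACGT") -> list:
    """List all unordered base pairings that are not Watson-Crick pairs."""
    return [
        first + second
        for first, second in combinations_with_replacement(bases, 2)
        if frozenset((first, second)) not in WATSON_CRICK
    ]


mismatches = mismatch_combinations()
print(len(mismatches), mismatches)
```

The pairings are unordered, so "CT" in this listing denotes the same mismatch as "TC" and "GT" the same as "TG".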

The amplification of 237 bp fragments was accomplished by means of the
fluorescence-labelled PUC19 F-primer 5'-FAM-GGATGTGCTGCAAGGCGAT-3' (SEQ ID No. 5) and
the fluorescence-labelled PUC19 R-primer 5'-JOE-GTGAGTTAGCTCACTCATTAG-3'
(SEQ ID No. 6). After heterohybrids of the partner fragments desired in the
respective case had been formed by denaturation at 95°C for 10 min and subsequent
gradual cooling, the activity assay was performed by incubating the heterohybrids with a 1:50
dilution of the CEL I-extract from the yeast expression supernatant at 47°C for 10 min.

CEL I cleaved one strand of the heterohybrids specifically at the site of the mismatch. After 
having applied the sample onto a GeneScan-gel, a 94 bp fragment and a 143 bp fragment were 
detected instead of the 237 bp fragments, as it was correspondingly the case for the opposite 
strand (FIG. 8). 
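
The two fragment sizes are complementary: a cut at the mismatch divides the 237 bp product into a labelled piece measured from one 5' end and a labelled piece measured from the other. The sketch below is a simplification that assumes a single cut per strand at the same position and ignores the small offset caused by cleavage 3' of the mismatch; the function and parameter names are illustrative.

```python
def labelled_fragment_sizes(amplicon_len: int, mismatch_pos: int) -> tuple:
    """Sizes of the two labelled cleavage products of a heterohybrid.

    mismatch_pos is the distance of the mismatch from one labelled 5' end.
    The first value is the fragment carrying that label, the second the
    fragment carrying the label of the opposite strand.
    """
    if not 0 < mismatch_pos < amplicon_len:
        raise ValueError("mismatch position must lie inside the amplicon")
    return mismatch_pos, amplicon_len - mismatch_pos


# Values from the text: 237 bp product, fragments of 94 bp and 143 bp.
print(labelled_fragment_sizes(237, 94))
```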

The enzyme produced in this method by means of the artificially synthesised gene displays
exactly its desired property, i.e. the precise recognition of all possible mismatch
combinations as well as the subsequent incision into one strand at the phosphodiester bond
immediately 3' of the detected base mismatch (Oleykowski et al., 1998).

A further aspect thus relates to the use of the recombinantly produced CEL I-enzyme
according to the invention for detecting both point mutations and larger mutated regions
such as deletions/insertions.

The appended sequence protocol shows: 

SEQ ID No. 1: primer Testf: 5'-ATGACCAGACTGTACTCCGTGTTC-3',
SEQ ID No. 2: primer AOX3': 5'-GCAAATGGCATTCTGACATCC-3',
SEQ ID No. 3: Sonde f: 5'-ATGACCAGACTGTACTCCGTGTTC-3',
SEQ ID No. 4: Sonde r: 5'-GTCAGGGGTATCAATGAAATGTAA-3',
SEQ ID No. 5: fluorescence-labelled PUC19 F-primer: 5'-FAM-GGATGTGCTGCAAGGCGAT-3',
SEQ ID No. 6: fluorescence-labelled PUC19 R-primer: 5'-JOE-GTGAGTTAGCTCACTCATTAG-3',
SEQ ID No. 7: nucleic acid sequence of the mature CEL I-enzyme according to the
invention after redraft for expressing the enzyme in yeast,
SEQ ID No. 8: amino acid sequence of the mature CEL I-enzyme, and
SEQ ID No. 9: presentation of the complete nucleotide sequence of the synthetic CEL I-gene.

The enclosed figures show: 

FIG. 1: A depiction of the nucleic acid sequence (SEQ ID No. 7) required for encoding the 
mature CEL I-enzyme after redraft for expressing the enzyme in yeast. The amino acid 
sequence (SEQ ID No. 8) is also presented. Furthermore presented are base deviations from 
the published original sequence, which are shown in grey characters. 

FIG. 2: A depiction of the complete nucleotide sequence of the synthetic CEL I-gene (SEQ ID
No. 9). The sequence is given as a double strand, wherein the deoxyoligonucleotides
necessary for synthesis are each printed in boldface on the respective strand. Also
indicated are the sequence modifications such as restriction sites, the Kozak sequence and the
His-tag encoding sequences. The His-tag encoding sequence sections positioned at the
N-terminus or the C-terminus are underlined; the underlined sequences were added either
N-terminally or C-terminally, but not at both termini. The oligonucleotides Testf/Sonde f or
Sonde r, which are required for several test experiments, are indicated by grey shading of the
sequence.

FIG. 3: Schematic depiction of the synthesis of the artificial CEL I-gene by means of
asymmetric PCR employing 16 overlapping deoxyoligonucleotides.

FIG. 4: Schematic depiction of the vector pPIC9 (Invitrogen), which is preferably used in this
invention. The figure shows the integration of the artificial CEL I-gene (about 100 bp) into
the EcoRI restriction site of the vector; also shown is the oligonucleotide primer AOX3'
required for the PCR test.

FIG. 5: PCR-result for the verification of the integration of the CEL I-gene into the genome of 
several yeast clones. Genomic DNA as a template was isolated from 20 yeast clones to be 
tested. Genomic DNA of two non-transformed yeast clones served as a template for the 
negative control (-). Purified vector-DNA of two original constructs served as a template for 
the positive control. The blank sample comprised water in order to exclude contaminations. 

16 clones of 20 clones to be tested are unambiguously positive. Corresponding to the positive 
controls, they show a band of about 1000 bp. Negative controls and blank sample are free of 
this signal. As a molecular weight marker (M), the "1 kb Plus DNA Ladder" (Gibco) is shown.

FIG. 6: Result of the Southern hybridisation for the further verification of 12 yeast clones 
tested before by the PCR method. Both as a positive control and as a size control, the 262 bp 
CEL I-specific probe was hybridised to plasmid-DNA (+) (Construct shown in FIG. 4). 

As a negative control, the probe was hybridized to genomic yeast DNA of a clone containing 
the parental vector pPIC 9 without the CEL I-insert (-). 

FIG. 7: Construct for generating defined heterohybrids for performing a specific activity assay
of the CEL I-enzyme. The two depicted synthetic oligonucleotides are constructed such that
they can be directly ligated into the EcoRI/HindIII-digested pUC19 vector after annealing
(which is possible due to their complementary nature). Each of the two deoxyoligonucleotides
is present in four versions. The letters Y and Z each symbolise all of the four possible
bases. In consequence, all of the eight possible base mismatches
(AA/TT/CC/GG/AC/AG/TC/TG) can be synthesised depending on the combination of the
oligonucleotides.

FIG. 8: Exemplary result of the specific activity assay of the recombinant CEL I-enzyme in
the case of the base mismatch AA. The 237 bp PCR product containing the mismatch is
specifically incised at the mutation. Partly the FAM-labelled strand and partly the JOE-labelled
strand is cleaved. Since the mutation is not located exactly in the middle of the PCR product,
the fragment is asymmetrically cleaved into a 94 bp fragment and a 143 bp fragment.

Claims:

1. Method for producing a nucleic acid sequence which codes for the complete CEL I-protein 
and can be recombinantly expressed in host cells, comprising the steps of: 

- Providing the sequence coding for the CEL I-protein from a suitable organism, in 
particular from Apium graveolens L., and 

- Appropriately modifying the codon frequency of the sequence to be expressed in 
comparison to the native sequence, wherein this modification is made with regard 
to the host organism to be used for expression, by means of 

a) Partitioning the planned sequence into an even number x, in particular 8, of overlapping 
regions, 

b) Synthesis of 2x mutated oligonucleotides, in particular oligonucleotides 1-16 to 16-16, 
each of which comprises the entire length of one overlapping region of both strands 
of the coding sequence, 

c) First PCR-amplification in order to produce x/2, in particular 4, overlapping 
fragments using the oligonucleotides of step b), 

d) Second PCR-amplification in order to produce x/4, in particular 2, overlapping 
regions using the fragments of step c), 

e) Third PCR-amplification in order to produce x/8, in particular 1, fragment, which 
comprises the coding region of CEL I. 

2. Method according to claim 1, characterised in that instead of step e) the following steps 
are performed: 

e') Cloning the fragments generated in step d) into a suitable vector, and 

f) Appropriately digesting the vectors and ligating the fragments in order to fuse said 
fragments to form the complete fragment comprising the coding region of CEL I. 

3. Method according to claim 1 or 2, characterised in that the codon frequency of the 
sequence to be expressed is modified according to the codon frequency of yeast. 

4. Method according to any one of the claims 1 to 3, characterised in that furthermore 
nucleotides are attached that code for additional N-terminal or C-terminal amino acid 
tags, in particular tags consisting of 6 histidines. 

5. Method according to any one of the claims 1 to 4, characterised in that the 
oligonucleotides synthesised in step (c) have an average length of 70 nucleotides and in 
each case overlap by about 20 bases. 

6. Method for producing a recombinant, complete CEL I-protein from Apium graveolens L., 
comprising 

- performing the method according to any one of the claims 1 to 5, and 

- expressing the nucleic acid sequence by means of a suitable expression system. 

7. Method according to claim 6, characterised in that a vector is used as an expression 
system, wherein this vector is selected from the pPIC 9, pPIC 3.5 and pQE-vectors. 

8. Method according to claim 6 or 7, characterised in that the nucleic acid sequence is 
expressed in a host cell selected from Hansenula polymorpha, Pichia pastoris, 
Saccharomyces cerevisiae, HeLa-cells, CHO-cells, Toxoplasma gondii and Leishmania. 

9. Method according to claim 8, characterised in that the employed Pichia pastoris strain is 
the strain GS115. 

10. Recombinant, complete CEL I-protein from Apium graveolens L., produced according 
to any one of the claims 6 to 9. 

11. Complete DNA-sequence of the CEL I-protein suitable for the expression of the CEL I- 
protein or expressible parts of this sequence derived from Apium graveolens, obtainable 
by a method according to any one of the claims 1 to 5. 

12. DNA-sequence of the CEL I-protein of Apium graveolens according to claim 11, 
characterised in that furthermore nucleotides are attached that code for additional 
N-terminal or C-terminal amino acid tags, in particular tags consisting of 6 
histidines. 

13. DNA-sequence of the CEL I-protein of Apium graveolens according to claim 11 or 12, 
characterised in that the sequence provides, at both of its ends, restriction sites for restriction 
endonucleases, which are absent in the remaining sequence and in a vector to be 
employed. 

14. Complete DNA-sequence of the CEL I-protein suitable for the expression of the CEL I- 
protein of Apium graveolens according to SEQ ID No. 7 or parts of this sequence. 

15. Host organism expressing a DNA-sequence according to any one of the claims 11 to 14. 

16. Host organism according to claim 15, selected from Hansenula polymorpha, Pichia 
pastoris, Saccharomyces cerevisiae, HeLa-cells, CHO-cells, Toxoplasma gondii and 
Leishmania. 

17. Host organism according to claim 16, characterised in that the employed Pichia pastoris 
strain is the strain GS115. 

18. Use of the recombinantly produced CEL I-enzyme according to claim 10 for detecting 
point mutations as well as larger mutated regions such as deletions/insertions. 
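
The fragment counts named in steps a) to e) of claim 1 follow from halving the number of pieces in each PCR round. The following Python sketch reproduces that arithmetic; it assumes that x is a power of two, so that the halving ends at exactly one fragment (claim 1 itself only requires an even x and names 8 as the particular case). The function name is illustrative.

```python
def assembly_rounds(x):
    """Return (number of oligonucleotides, fragment count after each PCR round).

    x is the number of overlapping regions chosen in step a).
    Assumption of this sketch: x is a power of two, so repeated halving
    ends with a single full-length fragment.
    """
    if x < 2 or x & (x - 1):
        raise ValueError("this sketch assumes x is a power of two, e.g. 8")
    oligos = 2 * x  # step b): one oligonucleotide per region and strand
    fragments = []
    n = x
    while n > 1:
        n //= 2  # each PCR round fuses neighbouring pieces pairwise
        fragments.append(n)
    return oligos, fragments


if __name__ == "__main__":
    oligos, fragments = assembly_rounds(8)
    print(oligos, fragments)
```

For x = 8 this gives 16 oligonucleotides and fragment counts of 4, 2 and 1 after the first, second and third amplification, matching the particular values in the claim.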
MTRLYSVFF 
ATGACCAGACTGTACTCCGTGTTCTT 
GC T A T T 

LLLALVVE PGVRAWS KEGHVM 
TCTACTTCTCGCCCTTGTCGTGGAGCCCGGTGTAAGAGCTTGGTCAAAGGAAGGACATGTTA 
TTGTGT ATAG T C AGC A C C 

TC QIAQDLL EPEAAHAVKML 
TGACCTGTCAGATTGCCCAGGACCTTCTTGAGCCAGAAGCCGCTCATGCGGTAAAGATGTTG 
AAGTGTG A A T C 

LPDYANGNLSSLCVWPDQIRH 
TTGCCTGATTATGCCAACGGAAACTTATCAAGCCTATGTGTTTGGCCAGATCAGATCCGTCA 
AGC T T C GTCG G G T ATA 

WYKYRWTSSLHFI DT P DQACS 
TTGGTACAAGTACCGTTGGACCTCCTCCTTACATTTCATTGATACCCCTGACCAAGCATGTT 
C AG TAG TC C C A T C 

FDYQRDCHDPHGGKDM CVAG 
CCTTTGACTATCAACGTGACTGTCATGATCCCCATGGTGGGAAGGACATGTGCGTTGCCGGC 
ATCGAA AA TTA 

A I QN F T S QL GH F R H G F S D RRY 
GCGATTCAAAATTTCACCTCTCAATTGGGGCATTTCCGTCACGGTACAAGTGATAGGCGATA 
C A GC T A CTATCCT 

N MT EAL L F L S H FM G D I H Q PMH 
CAATATGACT GAAGCTT TGCTCTTCCTTTCACACTTCATGGGAGACATTCATCAACCTATGC 
T A G TATTAC T G 

VGFT S DMGGNS I DL RWFRHK 
ATGTGGGATTTACTTCCGACATGGGCGGTAATAGTATTGATTTGAGGTGGTTTCGTCATAAA 
T AAGT T AAC A CC C C 

SNLHHVWDREIILTAAADYHG 
TCAAACCTGCATCACGTCTGGGATCGAGAGATCATTCTAACTGCTGCTGCTGATTATCACGG 
C CTT A T TA AA CT 

KDMHSLLQDIQRNFTEGSWLQ 
AAAGGATATGCATTCCTTGCTTCAAGACATTCAGAGAAATTTTACGGAGGGTTCTTGGTTGC 
T CTCCA AGC A AG 

DVESWKECDDISTCANKYAK 
AAGACGTGGAATCTTGGAAAGAATGCGATGATATCTCTACTTGTGCAAACAAATATGCTAAG 
TT C G T C CTG 

E S I KLACNWGYKDVE S GE TLS 
GAGTCAATTAAGTTGGCTTGTAACTGGGGGTATAAGGATGTAGAAAGTGGAGAGACATTGTC 
AGT A AC AC TCA T TC C A TC 

DKYFNTRMPIVMKRIAQGGIR 
GGATAAATATTTTAACACGCGAATGCCAATTGTTATGAAACGTATCGCCCAAGGAGGAATCA 
A CCAA C GATGT C 

LSMILNRVLGSSADHSLA 
GATTAAGCATGATTCTTAACCGTGTCCTGGGTTCGTCTGCTGACCATTCGTTGGCATAATAA 
T TC T G ATT AAGC CAT T 
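
The alignment above pairs each amino acid with its codon and the bases substituted in it. As a hedged illustration that such substitutions leave the encoded protein unchanged, the following Python sketch translates the first 24 nucleotides of the DNA line in the alignment (identical to SEQ ID No. 1) and the first 24 nucleotides of SEQ ID No. 7 with the standard genetic code and lists the codons that differ. The helper names are illustrative and not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(dna):
    """Split a DNA string into complete upper-case codons."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna):
    """Translate a DNA string into one-letter amino acid code."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def changed_codons(a, b):
    """Return (1-based codon position, codon in a, codon in b) where the two differ."""
    return [
        (i + 1, ca, cb)
        for i, (ca, cb) in enumerate(zip(codons(a), codons(b)))
        if ca != cb
    ]


# First 24 nt of the DNA line in the alignment above (also SEQ ID No. 1).
ALIGNMENT_START = "ATGACCAGACTGTACTCCGTGTTC"
# First 24 nt of SEQ ID No. 7.
SEQ7_START = "atgacgcgattatattctgtgttc"

if __name__ == "__main__":
    print(translate(ALIGNMENT_START), translate(SEQ7_START))
    print(changed_codons(ALIGNMENT_START, SEQ7_START))
```

Both stretches translate to the same eight residues, which are the first eight residues of SEQ ID No. 8, while five of the eight codons differ.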
Fig. 2 

EcoRI Kozak 6 x His Oligo 1 Oligo Test f / Probe f -> 

tacgacgaattcacc ATGGGACATCACCATCATCACCACAT^^ 

atgctgcttaagtgg TACCCTGTAGTGGTAGTAGTGGTGTATCTTCCTTCTT ACTGGTCTGACATGAGGCACAAGAAAGATGAAGAGCGGGAAC 

Oligo 2 

Oligo 3 

TCGTGGAGCCCGGTGTAAGGGCTTGGTCAAAGGAAGGACATGTTATGACCTGTCAGATTGCCCAGGACCTTCTTGAGCCAGAAGCCGCTCATGC 
AGCACCTCGGGCCACATTCCCGAACCAGTTTCCTTCCTGTACAATACTGGACAGTCTAACGGGTCCTGGAAGAACTCGGTCTTCGGCGAGTACG 



Oligo 5 

GGTAAAGATGTTGTTGCCTGATTATGCCAACGGAAACTTATCAAGCCTATGTGTTTGGCCAGATCAGATCCGTCATTGGTACAAGTACCGTTGG 
CCATTTCTACAACAACGGACTAATACGGTTGCCTTTGAATAGTTCGGATACACAAACCGGTGTAGTCTAGGCAGTAACCATGTTCATGGCAACC 
Oligo 4 



ACCTCC TCCTTACATTTCATTGATACTCCTGA CCAAGCATGTTCCTTTGACTATCAACGTGACTGTCATGATCCCCATGGTGGGAAGGACATGT 
TGGAGGA| |ffjffl|fi | ^^ 

<- Oligo Probe r Oligo 6 



Oligo 7 HindIII 

GCGTCGCCGGCGCGATTCAAAATTTCACCTCTCAATTGGGGCATTTCCGTCACGGTACAAGTGATAGGCGATACAATATGACTGAAGCTTTGCT 
CGCAACGGCCGCGCTAAGTTTTAAAGTGGAGAGTTAACCCCGTA^ 

Oligo 8 



Oligo 9 

CTTCCTTTCACACTTCATGGGAGACATTCATCAACCTATGCATGTGGGATTTACTTCCGACATGGGCGGTAATAGTATTGATTTGAGGTGGTTT 
GAAGGAAAGTGTGAAGTACCCTCTGTAAGTAGTTGGATACGTACACCCTAAATGAAGGCTGTACCCGCCATTATCATAACTAAACTCCACCAAA 

Oligo 10 



Oligo 11 

CGTCATAAATCAAACCTGCATCACGTCTGGGATCGAGAGATCATTCTAACTGCTGCTGCTGATTATCACGGAAAGGATATGCATTCCTTGCTTC 
GCAGTATTTAGTTTGGACGTAGTGCAGACCCTAGCTCTCTAGTAAGATTGACGACGACGACTAATAGTGCCTTTCCTATACGTAAGGAACGAAG 



Oligo 13 

AAGACATTCAGAGAAATTTTACGGAGGGTTCTTGGTTGCAAGACGTGGAATCTTGGAAAGAATGCGATGATATCTCTACTTGTGCAAACAAATA 
TTCTGTAAGTCTCTTTAAAATGCCTCCCAAGAACCAACGTTCTGCACCTTAGAACCTTTCTTACGCTACTATAGAGATGAACACGTTTGTTTAT 
Oligo 12 



TGCTAAGGAGTCAATTAAGTTGGCTTGTAACTGGGGGTATAAGGATGTAGAAAGTGGAGAGACATTGTCGGATAAATATTTTAACAGCGAATGC 
ACGATTCCTCAGTTAATTCAACCGAACATTGACCCCCATATTCCTACATCTTTCACCTCTCTGTAACAGCCTATTTATAAAATTGTGCGCTTAC 

Oligo 14 



Oligo 15 

CAATTGTTATGAAACGTATCGCCCAAGGAGGAATCAGATTAAGCATGATTCTTAACCGTGTCCTGGGTTCGTCTGCTGACCATTCGTTGGCAGG 
GTTAACAATACTTTGCATAGCGGGTTCCTCCTTAGTCTAATTCGTACTAAGAATTGGCACAGGACCCAAGCAGACGACTGGTAAGCAACCGTCC 

Oligo 16 



6 x HIS 2 x Stop EcoRI/XhoI 
TGGTCATCACCATCATCACCACTAATAAGAATTCtCCraocabcarrh 

Fig. 6 

[Southern blot image; lane labels: 9 10 11 13 14 15 16] 

WO 2004/035771 

PCT/EP2003/011210 



Fig. 7 

                      EcoRI                                        HindIII 
pUC19 nt391                                                                     nt453 pUC19 
CCAGTG          AATTCT ACG ATG CCG CTA Y AGT GAT CTG TCG A               AGCTTG 
GGTCACTTAA          GA TGC TAC GGC GAT Z TCA CTA GAC AGC TTC GA              AC 
Fig. 8 

SEQUENCE LISTING 
<110> Biopsytec Analytik GmbH 

<120> Mutated nucleic acid of a Cel I-endonuclease and methods for 
producing the recombinant full-length Cel I-protein 

<130> B30098PCT 

<160> 9 

<170> PatentIn version 3.2 

<210> 1 
<211> 24 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 1 

atgaccagac tgtactccgt gttc 24 

<210> 2 
<211> 21 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 2 

gcaaatggca ttctgacatc c 21 

<210> 3 
<211> 24 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 3 

atgaccagac tgtactccgt gttc 24 

<210> 4 
<211> 24 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 4 

<210> 5 
<211> 19 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 5 

ggatgtgctg caaggcgat 19 

<210> 6 
<211> 21 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 6 

gtgagttagc tcactcatta g 21 

<210> 7 
<211> 895 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 7 

atgacgcgat tatattctgt gttctttctt ttgttggctc ttgtagttga accgggtgtt 60 
agagcctgga gcaaagaagg ccatgtcatg acatgtcaaa ttgcgcagga tcttgttgga 120 
gccagaagca gcacatgctg taaagatgct gttaccggac tatgctaatg gcaacttatc 180 
gtcgctgtgt gtgtggcctg atcaaattcg acactggtac aagtaccgtt ggacctcctc 240 
cttacatttc attgataccc ctgaccaagc atgttccttt gactatcaac gtgactgtca 300 
tgatccccat ggtgggaagg acatgtgcgt tgccggcgcg attcaaaatt tcacctctca 360 
attggggcat ttccgtcacg gtacaagtga taggcgatac aatatgactg aagctttgct 420 
cttcctttca cacttcatgg gagacattca tcaacctatg catgtgggat ttacttccga 480 
catgggcggt aatagtattg atttgaggtg gtttcgtcat aaatcaaacc tgcatcacgt 540 
ctgggatcga gagatcattc taactgctgc tgctgattat cacggaaagg atatgcattc 600 
cttgcttcaa gacattcaga gaaattttac ggagggttct tggttgcaag acgtggaatc 660 
ttggaaagaa tgcgatgata tctctacttg tgcaaacaaa tatgctaagg agtcaattaa 720 
ttttaacacg cgaatgccaa ttgttatgaa acgtatcgcc caaggaggaa tcagattaag 840 
catgattctt aaccgtgtcc tgggttcgtc tgctgaccat tcgttggcat aataa 895 

<210> 8 
<211> 296 
<212> PRT 
<213> Apium graveolens 

<400> 8 

Met Thr Arg Leu Tyr Ser Val Phe Phe Leu Leu Leu Ala Leu Val Val 
1 5 10 15 

Glu Pro Gly Val Arg Ala Trp Ser Lys Glu Gly His Val Met Thr Cys 
20 25 30 

Gln Ile Ala Gln Asp Leu Leu Glu Pro Glu Ala Ala His Ala Val Lys 
35 40 45 

Met Leu Leu Pro Asp Tyr Ala Asn Gly Asn Leu Ser Ser Leu Cys Val 
50 55 60 

Trp Pro Asp Gln Ile Arg His Trp Tyr Lys Tyr Arg Trp Thr Ser Ser 
65 70 75 80 

Leu His Phe Ile Asp Thr Pro Asp Gln Ala Cys Ser Phe Asp Tyr Gln 
85 90 95 

Arg Asp Cys His Asp Pro His Gly Gly Lys Asp Met Cys Val Ala Gly 
100 105 110 

Ala Ile Gln Asn Phe Thr Ser Gln Leu Gly His Phe Arg His Gly Phe 
115 120 125 

Ser Asp Arg Arg Tyr Asn Met Thr Glu Ala Leu Leu Phe Leu Ser His 
130 135 140 

Phe Met Gly Asp Ile His Gln Pro Met His Val Gly Phe Thr Ser Asp 
145 150 155 160 

Met Gly Gly Asn Ser Ile Asp Leu Arg Trp Phe Arg His Lys Ser Asn 
165 170 175 

Leu His His Val Trp Asp Arg Glu Ile Ile Leu Thr Ala Ala Ala Asp 
180 185 190 

Tyr His Gly Lys Asp Met His Ser Leu Leu 

Phe Thr Glu Gly Ser Trp Leu Gln Asp Val Glu Ser Trp Lys Glu Cys 
210 215 220 

Asp Asp Ile Ser Thr Cys Ala Asn Lys Tyr Ala Lys Glu Ser Ile Lys 
225 230 235 240 

Leu Ala Cys Asn Trp Gly Tyr Lys Asp Val Glu Ser Gly Glu Thr Leu 
245 250 255 

Ser Asp Lys Tyr Phe Asn Thr Arg Met Pro Ile Val Met Lys Arg Ile 
260 265 270 

Ala Gln Gly Gly Ile Arg Leu Ser Met Ile Leu Asn Arg Val Leu Gly 
275 280 285 

Ser Ser Ala Asp His Ser Leu Ala 
290 295 



<210> 9 
<211> 986 
<212> DNA 
<213> Artificial 

<220> 

<223> Initial sequence from Apium graveolens, codon usage fitted to 
Pichia pastoris 

<400> 9 

tacgacgaat tcaccatggg acatcaccat catcaccaca tagaaggaag aatgaccaga 60 
ctgtactccg tgttctttct acttctcgcc cttgtcgtgg agcccggtgt aagggcttgg 120 
tcaaaggaag gacatgttat gacctgtcag attgcccagg accttcttga gccagaagcc 180 
gctcatgcgg taaagatgtt gttgcctgat tatgccaacg gaaacttatc aagcctatgt 240 
gtttggccag atcagatccg tcattggtac aagtaccgtt ggacctcctc cttacatttc 300 
attgatactc ctgaccaagc atgttccttt gactatcaac gtgactgtca tgatccccat 360 
ggtgggaagg acatgtgcgt cgccggcgcg attcaaaatt tcacctctca attggggcat 420 
ttccgtcacg 